Abstract

We present a method for predicting the failure rate, and thus the reliability, of an electronic system by summing the failure rates of each known failure mechanism. We combine the physics of failure for each mechanism with their effects as observed under high and low temperature and high and low voltage and current stresses. Our method assumes that the lifetime of each failure mechanism follows a constant-rate (exponential) distribution and that each mechanism is independently accelerated by the stress factors, which can be entered into a reliability model. The overall failure rate thus also follows an exponential distribution and is expressed in the standard FIT (Failure unIT, or Failures In Time). The method combines mathematical models for the known failure mechanisms and solves them simultaneously across a multiplicity of accelerated life tests to find a consistent set of weighting factors for each mechanism. Solving the system of equations yields a more accurate and unique combination for each system model by proportional summation of the contributing failure mechanisms.

I. Introduction


To this day, the users of our most sophisticated electronic systems, which include opto-electronic, photonic and Micro-ElectroMechanical Systems (MEMS) devices, are expected to rely on a simple reliability value, the Failure In Time (FIT), published by the supplier. The FIT is determined today in the product qualification process by use of the High Temperature Over-voltage Life-test (HTOL) or another standardized test, depending on the product. The manufacturer reports a zero-failure result from the given conditions of the single-point test and uses a single-mechanism model to fit an expected Mean Time To Failure (MTTF) at the operator's use conditions. The zero-failure qualification is well known as a very expensive exercise that provides nearly no useful information. As a result, designers often rely on Highly Accelerated Life Test (HALT) testing and on handbooks such as FIDES, Telcordia or MIL-HDBK-217 to estimate the failure rate of their products, knowing full well that these approaches act as guidelines rather than as a reliable prediction tool. Furthermore, with zero failures required for the "pass" criterion, as well as the poor correlation of expensive HTOL data to test and field failures, there is no way for designers to use this knowledge to build in reliability or to trade it off against performance. Prediction is not really the goal of these tests; however, current practice is to assign an expected failure rate, FIT, based only on this test, even if the presumed acceleration factor is not correct.

This paradigm is unfortunate, since the manufacturers of our electronic equipment put a great deal of effort, money and excellent personnel resources into studying each failure mechanism. Today's approach to reliability takes this intimate knowledge of the failure mechanisms and then does not communicate it downstream to the users, usually for fear that the models or the probabilistic interpretation will not be borne out. Hence, known and already characterized mechanisms that could lead to failure are left out of the equation. This leaves the final, sterilized, accelerated test as the only available assessment on which the user can rely. Everyone recognizes that the resulting calculation is far from being a reliable value for predicting anything about the life of the product. Worse than that, it makes a joke of the reliability prediction process and has led to confusion at all levels. It also makes a joke of the excellent and hard work of the reliability engineers who evaluate the failure probabilities and the underlying physics.

We have found a practical means of separating electronic device failure mechanisms at the system level. We tested Field Programmable Gate Arrays (FPGAs) over a large range of operating frequencies, at extremely high and low temperatures, and with voltages ranging from nominal to more than 2 times nominal. The result is the ability to completely distinguish the influences of hot carrier injection (HCI), electromigration (EM), bias temperature instability (BTI) and time-dependent dielectric breakdown (TDDB). Our result shows that a meaningful reliability prediction can be made by a summation of distinct intrinsic failure mechanisms as measured at the system level. The result of our work will be a system qualification protocol that can actually predict the FIT with much greater accuracy than a standard high-temperature or low-temperature overstress life qualification.

Chip and packaged system reliability is still measured by a failure unit, FIT. The FIT is a rate, defined as the number of expected device failures per billion part-hours. A FIT is assigned to each component and multiplied by the number of such devices in a system to approximate the expected system reliability. The semiconductor industry provides an expected FIT for every product that is sold, based on operation within the specified conditions of voltage, frequency, heat dissipation, etc. Hence, a system reliability model is a prediction of the expected mean time between failures (MTBF) for an entire system as the sum of the FIT rates for every component.

A FIT is defined in terms of an acceleration factor, AF, as:

$$\mathrm{FIT} = \frac{\#\text{failures}}{\#\text{tested} \times \text{hours} \times AF} \times 10^{9} \tag{1}$$

where #failures is the number of actual failures that occurred and #tested is the total number of units subjected to an accelerated test. Therefore, the failure rate is

$$\mathrm{FIT} = \frac{10^{9}}{\mathrm{MTBF}}. \tag{2}$$

The acceleration factor, AF, must be supplied by the manufacturer, since only they know the failure mechanisms that are being accelerated in the High Temperature Operating Life (HTOL) test; it is generally based on a company-proprietary variant of the MIL-HDBK-217 approach for accelerated life testing. The true task of reliability modeling, therefore, is to choose an appropriate value for AF based on the physics of the dominant failure mechanisms that would occur in the field for the device.


II. Standard HTOL

The standard HTOL qualification test is usually performed as the final qualification step of a semiconductor manufacturing process. The test consists of stressing some number of parts, usually about 100, for an extended time, usually 1000 hours, at an accelerated voltage and temperature. Two features cast doubt on the accuracy of this procedure. One is the lack of sufficient statistical data; the second is that companies generally present zero-failure results for their qualification tests and hence stress their parts at relatively low stress levels to guarantee zero failures during qualification testing.

Unfortunately, with zero failures no statistical data is acquired. A further weakness is the calculation of the acceleration factor, AF. A qualification test that results in zero failures allows the assumption (with only 60% confidence!) that no more than half a failure occurred during the accelerated test. Based on the example parameters (100 parts for 1000 hours), this results in a reported FIT = 5000/AF, which can be almost any value from less than 1 FIT to more than 500 FIT, depending on the conditions and the model used for the voltage and temperature acceleration.
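The arithmetic behind the 5000/AF figure can be sketched in a few lines. This is only an illustration of the FIT definition applied to the example parameters (100 parts, 1000 hours, half a failure assumed); the function name `fit_rate` and the list of acceleration factors are illustrative choices, not values from the study.

```python
def fit_rate(failures, units_tested, hours, accel_factor):
    """FIT = failures / (units * hours * AF) * 1e9, i.e. failures per billion part-hours."""
    equivalent_part_hours = units_tested * hours * accel_factor
    return failures / equivalent_part_hours * 1e9


# Zero-failure HTOL with the example parameters from the text:
# 100 parts, 1000 hours, and half a failure assumed at 60% confidence.
for af in (1, 10, 100, 1000, 10000):  # illustrative acceleration factors (assumed)
    print(f"AF = {af:>5}: reported FIT = {fit_rate(0.5, 100, 1000, af):g}")
```

With AF = 1 the function returns the 5000 FIT numerator; every other reported value is that number divided by whatever acceleration factor the manufacturer assumes, which is why the reported FIT is so sensitive to the chosen model.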

The accepted approach for measuring FIT would be reasonably correct if there were only a single dominant failure mechanism that is excited equally by either voltage or temperature, and if this same mechanism were the only one accelerated by the burn-in or accelerated test. For example, electromigration is known to follow Black's equation and is accelerated by increased stress current in a wire or by increased temperature of the device. If, however, multiple failure mechanisms are responsible for device failures, each failure mechanism should be modeled as an individual “element” in the system, and the component survival is modeled as the survival probability of all the “elements” as a function of time [1].

The acceleration of a single failure mechanism is a highly non-linear function of temperature and/or voltage. The temperature acceleration factor ($AF_T$) and the voltage acceleration factor ($AF_V$) can be calculated separately, and they are the subject of most studies of reliability physics. The total acceleration factor for a given stress combination is the product of the acceleration factors of temperature and voltage:

$$AF = AF_T \cdot AF_V \tag{3}$$

This acceleration factor model is widely used as the industry standard for device qualification. However, it only approximates a single dielectric-breakdown type of failure mechanism and does not correctly predict the acceleration of other mechanisms.
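As a rough illustration of the product form, the sketch below multiplies a temperature term and a voltage term. The Arrhenius form for temperature and the exponential form for voltage are commonly used textbook models and are assumptions here; this text does not state which expressions are used. The activation energy `ea_ev`, the voltage coefficient `gamma` and the example conditions are hypothetical values.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def af_temperature(t_use_c, t_stress_c, ea_ev):
    """Arrhenius temperature acceleration (assumed model); temperatures in deg C."""
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV * (1.0 / t_use_k - 1.0 / t_stress_k))


def af_voltage(v_use, v_stress, gamma):
    """Exponential voltage acceleration (assumed model); gamma in 1/V."""
    return math.exp(gamma * (v_stress - v_use))


def af_total(t_use_c, t_stress_c, v_use, v_stress, ea_ev, gamma):
    """Single-mechanism acceleration factor as the product AF_T * AF_V."""
    return af_temperature(t_use_c, t_stress_c, ea_ev) * af_voltage(v_use, v_stress, gamma)


# Hypothetical example: use at 55 C / 1.2 V, stress at 125 C / 1.6 V.
print(af_total(55.0, 125.0, 1.2, 1.6, ea_ev=0.7, gamma=3.0))
```

Because both terms are exponentials, a modest change in the assumed `ea_ev` or `gamma` shifts the result by orders of magnitude, which is the sensitivity the single-mechanism model hides.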

To be even approximately accurate, however, electronic devices should be considered to have several failure modes degrading simultaneously. Each mechanism ‘competes’ with the others to cause an eventual failure. When more than one mechanism exists in a system, the relative acceleration of each one must be defined and averaged at the applied condition. Every potential failure mechanism should be identified, and its unique AF should then be calculated at a given temperature and voltage so that the FIT rate can be approximated for each mechanism separately. The final FIT will then be the sum of the failure rates per mechanism, as described by:

$$\mathrm{FIT}_{total} = \mathrm{FIT}_1 + \mathrm{FIT}_2 + \dots + \mathrm{FIT}_i \tag{4}$$

whereby each mechanism leads to an expected failure unit per mechanism, $\mathrm{FIT}_i$. Unfortunately again, individual failure mechanisms are not uniformly accelerated by a standard HTOL test, and the manufacturer is forced to model a single acceleration factor that cannot be combined with the known physics-of-failure models.
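The sum-of-failure-rates idea can be checked numerically: for independent constant-rate mechanisms, the product of the individual survival probabilities equals the survival probability computed from the summed FIT. The per-mechanism FIT values below are hypothetical placeholders chosen only to make the sketch runnable.

```python
import math


def total_fit(fit_by_mechanism):
    """Total FIT as the sum of the per-mechanism FIT values."""
    return sum(fit_by_mechanism.values())


def survival(fit, hours):
    """Survival probability at time `hours` for a constant failure rate given in FIT."""
    return math.exp(-fit * 1e-9 * hours)


# Hypothetical per-mechanism FIT values (not measured data).
example = {"TDDB": 2.0, "HCI": 15.0, "NBTI": 1.0, "EM": 7.0}
ten_years = 10 * 8760  # hours

series_survival = math.prod(survival(f, ten_years) for f in example.values())
print(total_fit(example), series_survival, survival(total_fit(example), ten_years))
```

The two survival values agree, which is what allows each mechanism to be treated as one "element" of a series system.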

If multiple failure mechanisms, instead of a single mechanism, are assumed to be time-independent and independent of each other, FIT (the constant failure rate approximation) should be a reasonable approximation for realistic field failures. Under the assumption of multiple failure mechanisms, each will be accelerated differently, depending on the physics that is responsible for each mechanism. If, however, an HTOL test is performed at an arbitrary voltage and temperature for acceleration based only on a single failure mechanism, then only that mechanism will be accelerated. In that instance, which is generally true for most devices, the reported FIT (especially one based on zero failures) will be meaningless with respect to other failure mechanisms.


III.  Acceleration  Factor 

The qualification of device reliability, as reported by a FIT rate, must be based on an acceleration factor, which represents the failure model for the tested device. If we assume that there is no failure analysis (FA) of the devices after the HTOL test, or that the manufacturer will not report FA results to the customer, then a model should be made for the acceleration factor, AF, based on a combination of competing mechanisms. This will be explained by way of example. Suppose there are two identifiable, constant-rate competing failure modes (assume an exponential distribution). One failure mode is accelerated only by temperature. We denote its failure rate as $\lambda_1(T)$. The other failure mode is accelerated only by voltage, and the corresponding failure rate is denoted as $\lambda_2(V)$.

By performing the acceleration tests for temperature and voltage separately, we can get the failure rates of both failure modes at their corresponding stress conditions. Then we can calculate the acceleration factor of the mechanisms. If for the first failure mode we have $\lambda_1(T_1)$, $\lambda_1(T_2)$ and for the second failure mode we have $\lambda_2(V_1)$, $\lambda_2(V_2)$, then the temperature acceleration factor is:

$$AF_T = \frac{\lambda_1(T_2)}{\lambda_1(T_1)}, \quad T_1 < T_2 \tag{5}$$

and the voltage acceleration factor is:

$$AF_V = \frac{\lambda_2(V_2)}{\lambda_2(V_1)}, \quad V_1 < V_2 \tag{6}$$

The system acceleration factor between the stress conditions of $(T_1, V_1)$ and $(T_2, V_2)$ is:

$$AF = \frac{\lambda_1(T_2, V_2) + \lambda_2(T_2, V_2)}{\lambda_1(T_1, V_1) + \lambda_2(T_1, V_1)} = \frac{\lambda_1(T_2) + \lambda_2(V_2)}{\lambda_1(T_1) + \lambda_2(V_1)} \tag{7}$$

The above equation can be transformed to the following two expressions:

$$AF = \frac{\lambda_1(T_2) + \lambda_2(V_2)}{\dfrac{\lambda_1(T_2)}{AF_T} + \dfrac{\lambda_2(V_2)}{AF_V}} \tag{8}$$

or

$$AF = \frac{\lambda_1(T_1)\,AF_T + \lambda_2(V_1)\,AF_V}{\lambda_1(T_1) + \lambda_2(V_1)}$$
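For two competing constant-rate modes, the system failure rate is the sum of the two rates, so the system acceleration factor is the ratio of the summed rates at the two conditions. The sketch below, using hypothetical rates and acceleration factors, checks that this ratio can be written equivalently as a weighted combination of the two single-mechanism acceleration factors, with weights taken from either the low-stress or the high-stress rates. All function names and numbers are illustrative assumptions.

```python
def af_from_rates(lam1_low, lam2_low, lam1_high, lam2_high):
    """System AF as the ratio of summed failure rates at high and low stress."""
    return (lam1_high + lam2_high) / (lam1_low + lam2_low)


def af_from_low_stress(lam1_low, lam2_low, af_t, af_v):
    """Same AF written with the low-stress rates as weights."""
    return (lam1_low * af_t + lam2_low * af_v) / (lam1_low + lam2_low)


def af_from_high_stress(lam1_high, lam2_high, af_t, af_v):
    """Same AF written with the high-stress rates as weights."""
    return (lam1_high + lam2_high) / (lam1_high / af_t + lam2_high / af_v)


# Hypothetical rates (arbitrary units) and single-mechanism acceleration factors.
lam1_low, lam2_low = 1.0, 4.0
af_t, af_v = 100.0, 10.0
lam1_high, lam2_high = lam1_low * af_t, lam2_low * af_v

print(af_from_rates(lam1_low, lam2_low, lam1_high, lam2_high))
print(af_from_low_stress(lam1_low, lam2_low, af_t, af_v))
print(af_from_high_stress(lam1_high, lam2_high, af_t, af_v))
```

All three forms give the same value, and that value lies between the two single-mechanism factors. The system acceleration therefore depends on the proportion of each mechanism, not only on the stress conditions.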

Due to the exponential nature of the acceleration factor as a function of V or T, if only a single parameter is changed, it is unlikely that more than one mechanism will be accelerated significantly compared to the others for any given V and T. As we will see in the next section, at least four mechanisms should be considered. Also, the various voltage and temperature dependencies must be considered in order to make a reasonable reliability model for electronic devices. Until now, the assumption of equal failure probability at use conditions has been used, since it is the most conservative approach when the correct proportionality cannot be determined.


IV.  Proportionality  Matrix  Solution 

The basic method for solving the system of equations is described in the paper by Bernstein [2]. It follows the sum-of-failure-rates method suggested in JEDEC Standard JEP122G [1] and was published in a more recent paper by Bernstein [3]. The matrix method forms the basis for this work. It is clear that the manufacturers of electronic components recognize the importance of combining failure mechanisms in a sum-of-failure-rates method. Also, the formula for each mechanism is well studied and published.

Thus, we describe here the prediction of system reliability using a linear matrix solution. Although to date we have verified the methodology only on microelectronic device failure mechanisms, it applies directly to additional mechanisms, including thermal and mechanical stresses due to wafer bonding and any failure mechanism that can be modelled by physics of failure, including those of wide-bandgap semiconductors and even packaging failures.

This approach allows accelerated testing to be performed at increased voltages, temperatures, frequencies and power levels, and even under mechanical stresses and thermal cycles, to increase the separation of individual mechanisms in order to calibrate this matrix to actual components in a system. The matrix is then solved using input from multiple accelerated tests, compared against the relative contribution of each assumed mechanism. This approach requires multiple High Temperature Overstress Life-tests (M-HTOL) in order to accelerate different mechanisms in the same set of accelerated tests. The M-HTOL test allows calculations that consider all conditions simultaneously, so an appropriate failure rate calculation will determine the failure rate under actual operating conditions. Furthermore, a system can be de-rated for a more robust design and prolonged failure-free operation. This is accomplished by solving the matrix for any desired stress condition using the same proportionality factors as determined by the M-HTOL test. We will add thermo-mechanical and additional stresses related to packaging failures using this same methodology.

As part of calibrating the proportionality factors, accelerated test results are used as input to calculate failure rates for all the failure mechanisms. The output of the accelerated life tests determines the proportional acceleration factors for each of the various mechanisms. We assume the circuit itself is what determines the relative contribution of each mechanism, so a matrix is constructed based on the physics models (JEDEC [1] or manufacturer based) and solved for the experimental results. We assume that each test is performed under a specific set of conditions, which determines a specific failure rate that would lead to a failure.

This matrix, when solved for the relative contributions at each set of conditions, becomes a forecasting tool that determines the dominance of each failure mechanism and its relative contribution to the chance occurrence of a system failure. By solving the system of equations defined by the matrix, one can assess and predict the acceleration of each failure mechanism and its proportion in the circuit. Because this model assumes a constant total failure rate, the time at which a given percentage of units will fail can be used to calculate the duration of the warranty period and the approximate lifetime of the component. The matrix is described in Table 1.


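| Condition | TDDB | HCI | NBTI | EM | Results |
|---|---|---|---|---|---|
| V1, T1 | W·A1 | X·B1 | Y·C1 | Z·D1 | 1/MTTF1 |
| V2, T2 | W·A2 | X·B2 | Y·C2 | Z·D2 | 1/MTTF2 |
| V3, T3 | W·A3 | X·B3 | Y·C3 | Z·D3 | 1/MTTF3 |
| V4, T4 | W·A4 | X·B4 | Y·C4 | Z·D4 | 1/MTTF4 |

Table 1. M-HTOL Matrix used to solve models with measured times to fail.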


Each row describes the operating conditions under which the system is tested. Each experiment, i, is operated at its own unique voltage and temperature. The ‘Results’ column entry, $\mathrm{MTTF}_i$, is the average time at which failure occurs under the experimental condition, where failure is defined by a pre-determined failure point. Our example uses 5% performance degradation as the failure point; however, any reasonable value will work as long as it is consistent with the application. The result, $1/\mathrm{MTTF}_i$, is a failure rate $\lambda_i$, measured as the FIT and reported as $10^9/\mathrm{MTTF}_i$. This approach assumes that each mechanism follows a constant failure rate that is time independent. That is to say, the FIT or MTTF completely describes the reliability of each failure mechanism as well as of the whole system. A full justification is beyond the scope of this paper, but is well explained in the book by Bernstein [4]. In short, since the whole system is regarded as having a constant failure rate, we may treat each mechanism as having an average constant rate that is accelerated by the applied conditions.
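Under the constant-failure-rate assumption stated above, converting between MTTF and FIT, and finding the time by which a given fraction of units is expected to fail, are one-line calculations. The sketch below is only an illustration of those relations for an exponential lifetime; the function names and the example numbers are illustrative choices.

```python
import math


def fit_from_mttf(mttf_hours):
    """FIT (failures per 1e9 part-hours) for a constant failure rate 1/MTTF."""
    return 1e9 / mttf_hours


def mttf_from_fit(fit):
    """Inverse conversion: MTTF in hours from a FIT value."""
    return 1e9 / fit


def time_to_fraction_failed(fit, fraction):
    """Hours until `fraction` of units has failed, assuming an exponential lifetime."""
    failure_rate_per_hour = fit * 1e-9
    return -math.log(1.0 - fraction) / failure_rate_per_hour


# Hypothetical example: a 100 FIT part and a 1% cumulative-failure target.
hours = time_to_fraction_failed(100.0, 0.01)
print(f"1% of units fail after about {hours:.0f} hours ({hours / 8760:.1f} years)")
```

A calculation of this kind is what links a predicted FIT to a warranty period or an approximate component lifetime.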

The left-hand side of the matrix, then, specifies the acceleration of each mechanism at the tested operating conditions, while the measured experimental results comprise the right-hand side, as seen in Table 2. Each column in the matrix represents a different failure mechanism, while each row represents the relative acceleration for each V, T and frequency. We assume that each mechanism (A-D) affects the system linearly with its own acceleration factor (AF) at the given test conditions. The acceleration factor formulae are evaluated at the experimental condition of each result on the right-hand side. Thus, any failure mechanism will have a different value for each experiment, depending on the test conditions. We then solve the matrix to find a set of constants, $P_i$, shown here as W-Z, across the whole matrix that matches the experimental results with the calculated acceleration factors.


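$$
\underbrace{\begin{pmatrix}
A_1 & B_1 & C_1 & D_1 \\
A_2 & B_2 & C_2 & D_2 \\
A_3 & B_3 & C_3 & D_3 \\
A_4 & B_4 & C_4 & D_4
\end{pmatrix}}_{AF}
\underbrace{\begin{pmatrix} W \\ X \\ Y \\ Z \end{pmatrix}}_{P_i}
=
\begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \\ \lambda_4 \end{pmatrix}
$$

$$(AF) \cdot (P_i) = (\lambda) \;\Rightarrow\; (P_i) = (AF)^{-1} \cdot (\lambda)$$

Table 2: Demonstration of the Matrix Solution Method.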
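The matrix solution can be sketched with a small linear solver. The example below is a minimal illustration, assuming a hypothetical 3×3 matrix of acceleration factors and a hypothetical set of weights; it is not the data or the solver used in the study. It builds the failure rates from known weights, recovers the weights by Gaussian elimination, and then computes the share of each mechanism at one test condition.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's inputs are not modified.
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("singular matrix: the tests do not separate the mechanisms")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def predict_rate(af_row, weights):
    """Failure rate at one condition: sum of AF times weight over the mechanisms."""
    return sum(a * p for a, p in zip(af_row, weights))


def contributions(af_row, weights):
    """Fractional contribution of each mechanism to the failure rate at one condition."""
    total = predict_rate(af_row, weights)
    return [a * p / total for a, p in zip(af_row, weights)]


# Hypothetical acceleration factors: one row per test, one column per mechanism.
af_matrix = [[1.0, 50.0, 2.0], [200.0, 1.0, 3.0], [5.0, 4.0, 300.0]]
true_weights = [2e-3, 5e-2, 1e-4]  # hypothetical weights used to build the rates
rates = [predict_rate(row, true_weights) for row in af_matrix]

weights = solve_linear_system(af_matrix, rates)
print(weights)
print(contributions(af_matrix[0], weights))
```

Once the weights are known, `predict_rate` can be evaluated with acceleration factors computed for any untested condition, which is how the solved matrix is used for extrapolation.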

Knowledge of these coefficients allows prediction of the MTTF or the FIT for any other operating conditions that were not tested, and gives an accurate prediction of the reliability of the device under those conditions. The multiple life-tests performed are what comprise the multiple-HTOL (M-HTOL) testing, and they provide actual data to calibrate the expected reliability. The result is a meaningful value for the failure rate as measured in FIT.

When designing the M-HTOL test, it is important that each individual mechanism is accelerated more than the others in at least one of the tests, so that there is a reasonable calibration between the mechanisms. For example, if the relative contribution of one mechanism is never seen within the testing extremes, then including that mechanism may confuse the result, and it would be best not to include it as part of the model. For our example, we found that within the parameters of voltage, temperature and frequency, we were unable to accelerate time-dependent dielectric breakdown (TDDB) beyond NBTI or any other mechanism. Hence, we solved the full matrix as 3×3, including only the mechanisms that contributed significantly within the testing parameters.

In order to apply this methodology to packaging, where electrical considerations may be less important than thermo-mechanical or environmental stresses, we will determine final tests that cause intentional failures due to each of the studied mechanisms. The goal of an accelerated test will be to study the potential failure mechanisms that would occur with each new technology that is developed and to model their physics of failure. A final test matrix can then be developed that includes at least one failure during final test, so that a minimum design of experiments allows our matrix solution to predict the expected failure rate under user-defined conditions.

We use the matrix approach to model the useful-life failure rate (FIT) of components in electronic assemblies by assuming that each component is composed of multiple sub-components; for example, a certain percentage is effectively ring oscillator, static or dynamic random-access memory (SRAM or DRAM). Each type of circuit, based on its operation, can be seen to affect the potential steady-state (defect-related) failure mechanisms, for example electromigration, hot-carrier injection and NBTI, differently depending on the accelerated environment. Hence, the standard system reliability FIT can be modeled using traditional MIL-HDBK-217 types of algorithms and adapted to known system reliability tools. However, instead of treating each component as an individual, we propose treating each complex component as a series system of sub-components, each with its own reliability matrix. This matrix can then be solved at any given set of conditions, i.e. voltage, temperature and frequency, as percentages: at each stressed operating condition, there is a unique proportion of each mechanism that results in the given time to fail.

In order to find a relative failure rate for each mechanism, we take the accelerated life test data at various voltages and temperatures and extrapolate to an end-of-life time at each temperature and voltage condition. For each condition, a consistent failure criterion must be chosen, and the time to reach that degraded state yields the “Time To Fail” (TTF) for that set of voltage and temperature. Since the relative degradation is measured as the percentage change in ring-oscillator frequency, the time to fail is recorded as the time to 5% degradation, giving the results seen in Table 3.
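Extrapolating a time to 5% degradation from monitored data can be sketched as a curve fit followed by an inversion. The example below assumes a power-law trend, degradation = a·t^n, fitted by least squares in log-log space. That trend is a common choice for degradation data but is an assumption here, since the extrapolation model is not specified in this text. The sample readings are synthetic.

```python
import math


def fit_power_law(times_h, degradation_pct):
    """Least-squares fit of log(d) = log(a) + n*log(t); returns (a, n)."""
    xs = [math.log(t) for t in times_h]
    ys = [math.log(d) for d in degradation_pct]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    prefactor = math.exp(mean_y - exponent * mean_x)
    return prefactor, exponent


def time_to_fail(times_h, degradation_pct, criterion_pct=5.0):
    """Extrapolated time (hours) at which the fitted trend reaches the failure criterion."""
    prefactor, exponent = fit_power_law(times_h, degradation_pct)
    return (criterion_pct / prefactor) ** (1.0 / exponent)


# Synthetic ring-oscillator frequency degradation readings (percent), not measured data.
sample_times = [1.0, 10.0, 100.0, 1000.0]
sample_degradation = [0.5 * t ** 0.25 for t in sample_times]
print(time_to_fail(sample_times, sample_degradation))
```

Whatever trend is assumed, the same failure criterion has to be applied to every test condition so that the resulting TTF values are comparable across the matrix.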


| Volt | T (°C) | Freq (MHz) | EM | HCI | NBTI | Measured FIT | Calculated FIT |
|---|---|---|---|---|---|---|---|
| 2.4 | -20 | 500 | 0.00% | 100.00% | 0.00% | 8.00E+04 | 8.00E+04 |
| 1.2 | 140 | 500 | 100.00% | 0.00% | 0.00% | 5.40E-01 | 5.40E-01 |
| 2.4 | 160 | 0.02 | 0.58% | 3.40% | 96.02% | 1.78E-02 | 1.78E-02 |
| 3 | 0 | 500 | 0.00% | 81.16% | 18.84% | 3.45E+07 | 3.43E+07 |
| 2.4 | 173 | 500 | 35.19% | 61.50% | 3.31% | 1.76E+01 | 1.73E+01 |
| 2.4 | 160 | 500 | 14.56% | 85.34% | 0.10% | 1.50E+01 | 1.77E+01 |
| 3 | 0 | 0.20 | 0.00% | 0.02% | 99.98% | 6.37E+06 | 6.47E+06 |

Table 3. Test results showing the proportions of failure mechanisms for given V, T and F, compared with the calculated as well as the measured failure rate (FIT).
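The Table 3 values can be checked directly. The sketch below transcribes the table and reports, for each test, which mechanism dominates and how far the calculated FIT deviates from the measured FIT. The helper names are illustrative; only the numbers come from the table.

```python
MECHANISMS = ("EM", "HCI", "NBTI")

# (volt, temp_c, freq_mhz, EM %, HCI %, NBTI %, measured FIT, calculated FIT), from Table 3.
TABLE_3 = [
    (2.4, -20, 500, 0.00, 100.00, 0.00, 8.00e4, 8.00e4),
    (1.2, 140, 500, 100.00, 0.00, 0.00, 5.40e-1, 5.40e-1),
    (2.4, 160, 0.02, 0.58, 3.40, 96.02, 1.78e-2, 1.78e-2),
    (3.0, 0, 500, 0.00, 81.16, 18.84, 3.45e7, 3.43e7),
    (2.4, 173, 500, 35.19, 61.50, 3.31, 1.76e1, 1.73e1),
    (2.4, 160, 500, 14.56, 85.34, 0.10, 1.50e1, 1.77e1),
    (3.0, 0, 0.20, 0.00, 0.02, 99.98, 6.37e6, 6.47e6),
]


def dominant_mechanism(row):
    """Name of the mechanism with the largest share in one test."""
    shares = row[3:6]
    return MECHANISMS[shares.index(max(shares))]


def relative_deviation(row):
    """(calculated - measured) / measured for one test."""
    measured, calculated = row[6], row[7]
    return (calculated - measured) / measured


for row in TABLE_3:
    print(row[:3], dominant_mechanism(row), f"{relative_deviation(row):+.1%}")
```

Each of the three modeled mechanisms dominates at least one test, which is the separation the M-HTOL design relies on. The largest disagreement between calculated and measured FIT is at 2.4 V, 160 °C, 500 MHz.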


An absolute FIT value is determined for each test based on the mean time to fail, which allows calibration of the final results in operation. The last column is the expected FIT (failures per billion part-hours) at those conditions. By substituting these percentages into the matrix, the true acceleration factors are determined not only for the tested conditions but also for any extrapolated condition, such as the operating conditions shown in Table 4.

| Volt | T (°C) | F (MHz) | EM | HCI | NBTI | Calculated FIT |
|---|---|---|---|---|---|---|
| 1.1 | 30 | 0.02 | 99.76% | 0.24% | 0.00% | 2.87E-10 |
| 1.2 | 70 | 500 | 100.00% | 0.00% | 0.00% | 9.88E-04 |
| 1.5 | 80 | 500 | 98.58% | 1.42% | 0.00% | 3.00E-03 |
| 2 | -50 | 500 | 0.00% | 100.00% | 0.00% | 3.12E+03 |
| 1.8 | 125 | 500 | 98.23% | 1.77% | 0.00% | 1.86E-01 |
| 2 | 150 | 0.02 | 96.20% | 3.80% | 0.00% | 5.16E-05 |

Table 4. Calculated FIT based on the solved matrix for typical use conditions.

A  calculated  reliability  curve  is  shown  in  Figure  1  showing  the  full  range  of  expected  FIT  versus 
Temperature  for  any  set  of  operational  conditions  based  on  the  matrix  of  Table  4.  A  full  range  of 
temperatures,  frequencies  and  core  voltage  is  substituted  into  the  appropriate  equations  based  on 
the  proportionality  solution  from  the  results  of  Table  3. 


[Figure: FIT vs. temperature for different voltages and different frequencies. Series: 1.5 V at 0.5 GHz, 1.5 V at 0.1 GHz, 1.5 V at 2 GHz, 1 V at 1 GHz, 1.7 V at 1 GHz. Horizontal axis: Temp (°C).]

Figure 1. Failure Rate (FIT) calculations versus Temp, for a variety of Voltages and Frequencies.

The unique solution that solves all the equations with the extrapolated acceleration factors gives a percentage contribution for each of the failure mechanisms. We report the reliability as FIT, which is $10^9/\mathrm{MTTF}$ for each condition. The percentages for each mechanism are shown, based on the relative contributions that were extrapolated from the physics-of-failure equations normalized to the measured FIT of each test.

The most important result from our study is that electromigration and HCI are the dominant failure mechanisms throughout the useful range of device operation. This is surprising, since the standard HTOL test emphasizes only TDDB and BTI because those are most accelerated by high voltage and temperature; under use conditions, however, the other two are most important. Furthermore, at very low temperature and high frequency, HCI is the most important failure mechanism, which could have very important implications for satellite and low-temperature military applications. Fortunately, very low FIT values are found, and reliability can be predicted with confidence.


V. Summary

We present here a simple and accurate way to combine the physics-of-failure equations for reliability prediction from accelerated life testing. We show that a matrix approach allows the reliability physics equations to be fit proportionally to the results of monitored accelerated life testing in order to extrapolate the failure rate one would expect under actual operating parameters. This methodology can be extended to include radiation effects, frequency and even packaging and solder joint effects to give a complete system reliability evaluation framework. This matrix gives a very cost-effective way to predict reliability based on the physics of failure using only 3 tests, as compared to the normal single-mechanism approach.